This project presents a deep learning-based real-time framework for detecting insider threats using a hybrid model that integrates sequence modeling and relational learning. The system analyzes user activity data dynamically and predicts potential insider threats without human intervention. Leveraging Long Short-Term Memory (LSTM) networks for user behavior sequence analysis and Graph Neural Networks (GNNs) for peer-context enrichment, the framework accurately identifies anomalies at the activity level. Each user action is encoded, evaluated against similar activities in the organization, and classified based on anomaly scores. Using the CERT insider threat dataset, the system is evaluated with precision, recall, and F1-score metrics. A visualization dashboard supports real-time monitoring and alerting for security analysts. This project enhances the ability to proactively detect and respond to insider threats across various organizational environments.
Introduction
The project addresses insider threat detection by proposing a hybrid AI framework that combines Long Short-Term Memory (LSTM) networks and Graph Neural Networks (GNNs) to model both individual user behavior sequences and contextual peer relationships. This approach improves detection accuracy by identifying subtle anomalies in real-time user activity data.
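The sequence-modeling half of the hybrid can be illustrated with a single LSTM cell written in plain Python. This is a minimal sketch, not the project's implementation: the function names (`lstm_step`, `encode_sequence`), the hidden size, and the random untrained weights are all assumptions made for illustration. A real system would use a trained LSTM from a deep learning library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, W, b):
    """One LSTM time step.

    x: encoded activity (list of floats); h, c: previous hidden and cell state.
    W: gate name -> weight matrix of shape [hidden][len(x) + hidden].
    b: gate name -> bias vector of length hidden.
    """
    z = x + h  # concatenate the input with the previous hidden state

    def affine(gate):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(W[gate], b[gate])]

    i = [sigmoid(a) for a in affine("i")]    # input gate
    f = [sigmoid(a) for a in affine("f")]    # forget gate
    o = [sigmoid(a) for a in affine("o")]    # output gate
    g = [math.tanh(a) for a in affine("g")]  # candidate cell update
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def encode_sequence(seq, hidden=4, seed=0):
    """Run a user's activity sequence through the cell; return the final hidden state.

    Weights are random (untrained) and only demonstrate the data flow.
    """
    rng = random.Random(seed)
    dim = len(seq[0])
    W = {gate: [[rng.uniform(-0.5, 0.5) for _ in range(dim + hidden)]
                for _ in range(hidden)] for gate in "ifog"}
    b = {gate: [0.0] * hidden for gate in "ifog"}
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in seq:
        h, c = lstm_step(x, h, c, W, b)
    return h


# Three one-hot encoded actions in order, e.g. logon, file access, email.
session = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
embedding = encode_sequence(session)
```

The final hidden state acts as a fixed-length summary of the user's recent behavior, which is the kind of vector a graph stage could then compare against peer activity.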
Key Components:
LSTM captures temporal user behavior patterns and predicts future actions.
GNN models relationships between similar activities, enhancing anomaly detection through peer context.
FAISS is used for fast nearest neighbor retrieval to efficiently build dynamic local graphs.
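The retrieval and peer-context steps listed above can be sketched as follows. This example assumes activity embeddings are plain lists of floats and uses a brute-force cosine search as a stand-in for FAISS (in practice an index such as `faiss.IndexFlatIP` over normalized vectors would serve the same role). The helper names and the unweighted mean aggregation are illustrative assumptions, not the project's actual GNN layer.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def knn(query, history, k):
    """Indices of the k history vectors most similar to the query (brute force)."""
    ranked = sorted(range(len(history)),
                    key=lambda idx: cosine(query, history[idx]),
                    reverse=True)
    return ranked[:k]


def aggregate(query, history, neighbors):
    """One message-passing step without learned weights:
    average the neighbor embeddings, then average that with the query."""
    dim = len(query)
    mean = [sum(history[idx][d] for idx in neighbors) / len(neighbors)
            for d in range(dim)]
    return [(q + m) / 2 for q, m in zip(query, mean)]


def peer_anomaly_score(query, history, k=3):
    """Higher when the activity is unlike even its closest peers."""
    neighbors = knn(query, history, k)
    mean_sim = sum(cosine(query, history[idx]) for idx in neighbors) / len(neighbors)
    return 1.0 - mean_sim


# Hypothetical 2-D embeddings of past activities in the organization.
history = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
typical = [1.0, 0.05]
unusual = [-1.0, -1.0]
local_graph = knn(typical, history, k=2)          # edges from the new node to its peers
enriched = aggregate(typical, history, local_graph)
```

Building the graph only around each incoming activity keeps the per-event cost proportional to `k` rather than to the size of the history, which is why a fast nearest-neighbor index matters for real-time use.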
The system is trained and tested on the CERT insider threat dataset, a synthetic benchmark of simulated enterprise activity, using precision, recall, and F1-score as key metrics. The hybrid LSTM-GNN model outperforms sequence-only and rule-based methods, offering scalable, responsive, and accurate insider threat detection.
Additional Highlights:
Competitive learning selects the most relevant historical behaviors to enhance graph construction.
System optimization includes asynchronous processing and edge computing for low latency.
Comprehensive preprocessing ensures high-quality input from diverse features like activity logs, user roles, and personality traits.
A proposed monitoring dashboard supports cybersecurity analysts with real-time alerts.
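As an illustration of the preprocessing step, the sketch below turns one activity record into a numeric feature vector. The activity vocabulary, the role set, the event dictionary layout, and the assumed 10 to 50 range for the five personality scores are all assumptions made for this example and should be checked against the actual dataset files.

```python
import math

# Assumed vocabularies; adjust to the log sources and roles actually present.
ACTIVITY_TYPES = ["logon", "device", "file", "email", "http"]
ROLES = ["employee", "manager", "it_admin"]  # hypothetical role set
TRAITS = ["O", "C", "E", "A", "N"]           # five personality scores


def one_hot(value, vocabulary):
    return [1.0 if value == v else 0.0 for v in vocabulary]


def encode_event(event):
    """Map one activity record to a fixed-length feature vector."""
    # Encode hour of day on a circle so 23:00 and 00:00 end up close together.
    angle = 2 * math.pi * event["hour"] / 24
    time_features = [math.sin(angle), math.cos(angle)]
    # Min-max scale trait scores, assuming they lie in [10, 50].
    traits = [(event["traits"][t] - 10) / 40 for t in TRAITS]
    return (one_hot(event["activity"], ACTIVITY_TYPES)
            + one_hot(event["role"], ROLES)
            + time_features
            + traits)


example = {
    "activity": "file",
    "role": "it_admin",
    "hour": 0,
    "traits": {"O": 30, "C": 50, "E": 10, "A": 20, "N": 40},
}
vector = encode_event(example)  # 5 + 3 + 2 + 5 = 15 features
```

Unknown activity types or roles encode to an all-zero block here; a production pipeline would more likely reserve an explicit "unknown" slot.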
Conclusion
The experimental evaluation indicates that the graph-based anomaly detection approach, combined with concept drift handling, identifies key insider threat events in real time. The model achieved high recall across all three tested event categories (Unauthorized Access, Data Exfiltration, and Privilege Escalation), showing that it detects most of the relevant malicious activity.
Unauthorized Access showed the best performance, with an F1-score of 0.87; Data Exfiltration and Privilege Escalation both reached 0.83. Recall ranged from 0.90 to 0.95 across categories, so the system misses at most about one in ten relevant instances. Precision ranged from 0.76 to 0.80, which means that roughly one in four to one in five alerts is a false positive: a moderate false-alarm rate rather than a minimal one.
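For reference, the relationship between the three reported metrics can be sketched as below. The confusion counts in the usage example are hypothetical and are not taken from the evaluation. Pairing the upper ends of the reported precision and recall ranges is also only an illustration, since the per-category precision and recall values are not listed individually.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Compute all three metrics from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# Upper ends of the reported ranges: precision 0.80, recall 0.95.
print(round(f1_score(0.80, 0.95), 2))  # 0.87

# Hypothetical counts: 95 threats caught, 24 false alarms, 5 threats missed.
p, r, f1 = precision_recall_f1(tp=95, fp=24, fn=5)
false_alarm_share = 1 - p  # fraction of raised alerts that are false positives
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so the precision figures are what limit the F1-scores here.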
These findings support the use of graph-based anomaly detection for identifying insider threats in real time, even as concept drift occurs. The approach is expected to scale to different network environments and can be extended in future work with additional data sources, further deep learning models, and more advanced anomaly detection techniques.